Learning Object Detectors with Semi-Annotated Weak Labels

👤 Guoyu Guang, Dingwen Zhang, Junwei Han
📅 July 2019
IEEE TCSVT Journal article

Abstract

To alleviate the human labor of annotating training data for object detectors, recent research has focused on semi-supervised object detection (SSOD) and weakly supervised object detection (WSOD) approaches.

In SSOD, instead of annotating every instance in the training set, annotators only need to label part of the training instances with bounding boxes. In WSOD, annotators provide image-level tags on all training images to indicate the object categories each image contains, and more detailed bounding box annotations are no longer needed.

Along this line of research, this paper goes a step further in reducing the human labor of annotating training data, leading to the problem of object detection with semi-annotated weak labels (ODSAWL). Instead of requiring image-level tags on all training images, ODSAWL needs image-level tags for only a small portion of them. The object detectors are then learned from this small weakly labeled portion together with the remaining unlabeled training images.
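The ODSAWL annotation setting can be illustrated with a small data-preparation sketch. This is only an illustration under stated assumptions: the `TrainingImage` record, the `split_semi_annotated` helper, the random selection of the labeled subset, and the toy tag dictionary are all hypothetical and do not come from the paper. The point it shows is that no image carries bounding boxes, and only a chosen fraction of images keeps its image-level tags.

```python
import random
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Optional, Set, Tuple


@dataclass
class TrainingImage:
    """Hypothetical record for one training image (no bounding boxes anywhere)."""
    image_id: str
    # Image-level tags (object categories present); None means the image is unlabeled.
    tags: Optional[FrozenSet[str]] = None


def split_semi_annotated(
    image_ids: Iterable[str],
    tag_lookup: Dict[str, Set[str]],
    labeled_fraction: float,
    seed: int = 0,
) -> Tuple[List[TrainingImage], List[TrainingImage]]:
    """Keep tags for a random `labeled_fraction` of images and leave the rest untagged."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    n_labeled = round(labeled_fraction * len(ids))
    labeled = [TrainingImage(i, frozenset(tag_lookup[i])) for i in ids[:n_labeled]]
    unlabeled = [TrainingImage(i) for i in ids[n_labeled:]]
    return labeled, unlabeled


# Toy data: 20 images with made-up tags (purely illustrative).
tag_lookup = {
    f"img_{k:03d}": ({"dog"} if k % 2 else {"cat", "person"}) for k in range(20)
}
labeled, unlabeled = split_semi_annotated(tag_lookup, tag_lookup, labeled_fraction=0.15)
print(len(labeled), "weakly labeled;", len(unlabeled), "unlabeled")
```

Random selection is just one possible way to pick the labeled subset; the summary above does not say how the labeled portion is chosen.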

Methodology

To address this challenging problem, the paper proposes a cross-model co-training framework in which an object localizer and a tag generator collaborate through an alternating optimization procedure.

Specifically, during learning, the two deep models transfer the needed knowledge, including labels and visual patterns, to each other. The whole learning procedure is carried out in a few stages under the guidance of a progressive learning curriculum.

Key Components:

Object Localizer: Learns to detect and localize objects in images

Tag Generator: Generates image-level tags for unlabeled images

Alternating Optimization: The two models are updated in turn, so that each iteratively improves the other

Progressive Learning Curriculum: Guides the learning procedure through multiple stages
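How the four components might fit together can be sketched as a short training loop. This is a minimal sketch under loudly stated assumptions, not the authors' implementation: the `ToyModel` stand-in, the `fit` and `predict_tags` method names, the confidence thresholds, and the choice of an easy-to-hard curriculum expressed as decreasing thresholds are all hypothetical. The only things taken from the summary are the two models, the exchange of labels between them, and the stage-wise procedure.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Tags = FrozenSet[str]


class ToyModel:
    """Stand-in for either deep model (hypothetical interface).

    It does no real learning: `fit` only records the training-set size, and
    `predict_tags` returns a fixed (tags, confidence) pair per image.
    """

    def __init__(self, oracle: Dict[str, Tuple[Tags, float]]):
        self.oracle = oracle
        self.train_sizes: List[int] = []

    def fit(self, tagged: Dict[str, Tags]) -> None:
        self.train_sizes.append(len(tagged))

    def predict_tags(self, image_id: str) -> Tuple[Tags, float]:
        return self.oracle[image_id]


def co_train(
    localizer: ToyModel,
    tag_generator: ToyModel,
    labeled: Dict[str, Tags],
    unlabeled_ids: Sequence[str],
    thresholds: Sequence[float],
) -> Dict[str, Tags]:
    """Alternate between the two models, one curriculum stage per threshold."""
    feedback: Dict[str, Tags] = {}  # tags the localizer passes back to the tag generator
    for threshold in thresholds:  # assumed curriculum: stricter stages come first
        # 1) Tag generator learns from human tags plus localizer feedback.
        tag_generator.fit({**feedback, **labeled})
        # 2) It tags unlabeled images; only confident predictions are kept.
        generated: Dict[str, Tags] = {}
        for image_id in unlabeled_ids:
            tags, confidence = tag_generator.predict_tags(image_id)
            if confidence >= threshold:
                generated[image_id] = tags
        # 3) Localizer learns from human tags plus generated tags.
        localizer.fit({**generated, **labeled})
        # 4) Localizer feeds back tags it confirms with enough confidence.
        feedback = {}
        for image_id in generated:
            tags, confidence = localizer.predict_tags(image_id)
            if confidence >= threshold:
                feedback[image_id] = tags
    return feedback


# Toy run with made-up confidences (illustrative only).
labeled = {"a": frozenset({"cat"})}
unlabeled_ids = ["b", "c", "d"]
tagger = ToyModel({
    "b": (frozenset({"dog"}), 0.9),
    "c": (frozenset({"cat"}), 0.6),
    "d": (frozenset({"car"}), 0.2),
})
localizer = ToyModel({
    "b": (frozenset({"dog"}), 0.95),
    "c": (frozenset({"cat"}), 0.7),
    "d": (frozenset({"car"}), 0.1),
})
pseudo_tags = co_train(localizer, tagger, labeled, unlabeled_ids, thresholds=[0.8, 0.5])
print(sorted(pseudo_tags), localizer.train_sizes)
```

In this toy run the localizer's training set grows from one stage to the next as the threshold relaxes, which is the behavior a progressive curriculum is meant to produce. In a real system the two stand-ins would be replaced by trainable networks, and the knowledge passed between them would also include visual patterns, which this sketch does not model.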

Experimental Results

To demonstrate the effectiveness of the proposed approach, comprehensive experiments are conducted on three benchmark datasets, and the results are encouraging.

Notably, with only about 15% of the training images weakly labeled, the proposed approach comes close to, or even outperforms, state-of-the-art WSOD methods.

This demonstrates that the proposed ODSAWL framework can significantly reduce the annotation burden while maintaining competitive detection performance, making it a practical solution for real-world applications where obtaining full annotations is expensive or time-consuming.

Keywords: Deep Learning, Computer Vision, Image Processing, Object Detection, Semi-Annotated Weak Labels, Semi-supervised Learning
